subset sum
Optimal Lottery Tickets via Subset Sum: Logarithmic Over-Parameterization is Sufficient
A recent work by Malach et al. [MYSS20] establishes the first theoretical analysis for the strong LTH: one can provably approximate a neural network of width $d$ and depth $l$, by pruning a random one that is a factor $O(d^4 l^2)$ wider and twice as deep. This polynomial over-parameterization requirement is at odds with recent experimental research that achieves good approximation with networks that are a small factor wider than the target. In this work, we close the gap and offer an exponential improvement to the over-parameterization requirement for the existence of lottery tickets. We show that any target network of width $d$ and depth $l$ can be approximated by pruning a random network that is a factor $O(log(dl))$ wider and twice as deep. Our analysis heavily relies on connecting pruning random ReLU networks to random instances of the Subset Sum problem. We then show that this logarithmic over-parameterization is essentially optimal for constant depth networks. Finally, we verify several of our theoretical insights with experiments.
Review for NeurIPS paper: Explaining Naive Bayes and Other Linear Classifiers with Polynomial Time and Delay
Additional Feedback: It would be interesting to see a discussion of how this work lies in comparison to classes of knowledge bases that enable tractable abductive reasoning [1]. For example, is this result a special case of some known class/language? I just wanted to address the author's request for specific references "that might cast doubt on the novelty of our work". Sorry for not being more concrete, but here are some specific references. David Eppstein The polynomial time enumeration algorithm proposed for Eq 16 is basically subset sum where we enumerate all subsets that sum less than some threshold.
Review for NeurIPS paper: Optimal Lottery Tickets via Subset Sum: Logarithmic Over-Parameterization is Sufficient
Summary and Contributions: This paper considers the strong lottery ticket hypothesis -- a conjecture that a randomly initialized neural network can be pruned (i.e. The authors show that when the target function is a fully-connected neural network, such a pruning will exist with high probability whenever the randomly initialized network has twice as many layers and has a width that is a log(d*l/epsilon) factor larger than the target network. Here, d is the width of the target network, l is its depth, and epsilon is the desired accuracy. This is an improvement over the best/only known result (Malach et al. 2020) on this problem that showed that this can be achieved with a width that is a poly(d, l, 1/epsilon) factor larger than the target. The improvement is achieved by essentially reusing the proof of Malach et al., but fixing a key step where polynomial factors were lost by appealing to known results on random subset sum problems.
Optimal Lottery Tickets via Subset Sum: Logarithmic Over-Parameterization is Sufficient
A recent work by Malach et al. [MYSS20] establishes the first theoretical analysis for the strong LTH: one can provably approximate a neural network of width d and depth l, by pruning a random one that is a factor O(d 4 l 2) wider and twice as deep. This polynomial over-parameterization requirement is at odds with recent experimental research that achieves good approximation with networks that are a small factor wider than the target. In this work, we close the gap and offer an exponential improvement to the over-parameterization requirement for the existence of lottery tickets. We show that any target network of width d and depth l can be approximated by pruning a random network that is a factor O(log(dl)) wider and twice as deep. Our analysis heavily relies on connecting pruning random ReLU networks to random instances of the Subset Sum problem.
Choice Set Optimization Under Discrete Choice Models of Group Decisions
Tomlinson, Kiran, Benson, Austin R.
The way that people make choices or exhibit preferences can be strongly affected by the set of available alternatives, often called the choice set. Furthermore, there are usually heterogeneous preferences, either at an individual level within small groups or within sub-populations of large groups. Given the availability of choice data, there are now many models that capture this behavior in order to make effective predictions. However, there is little work in understanding how directly changing the choice set can be used to influence a group's preferences or decisions. Here, we use discrete choice modeling to develop an optimization framework of such interventions for several problems of group influence, including maximizing agreement or disagreement and promoting a particular choice. We show that these problems are NP-hard in general but imposing restrictions reveals a fundamental boundary: promoting an item is easier than maximizing agreement or disagreement. After, we design approximation algorithms for the hard problems and show that they work extremely well for real-world choice data.
On the Constrained Least-cost Tour Problem
O'Hara, Patrick, Ramanujan, M. S., Damoulas, Theodoros
We introduce the Constrained Least-cost Tour (CLT) problem: given an undirected graph with weight and cost functions on the edges, minimise the total cost of a tour rooted at a start vertex such that the total weight lies within a given range. CLT is related to the family of Travelling Salesman Problems with Profits, but differs by defining the weight function on edges instead of vertices, and by requiring the total weight to be within a range instead of being at least some quota. We prove CLT is $\mathcal{NP}$-hard, even in the simple case when the input graph is a path. We derive an informative lower bound by relaxing the integrality of edges and propose a heuristic motivated by this relaxation. For the case that requires the tour to be a simple cycle, we develop two heuristics which exploit Suurballe's algorithm to find low-cost, weight-feasible cycles. We demonstrate our algorithms by addressing a real-world problem that affects urban populations: finding routes that minimise air pollution exposure for walking, running and cycling in the city of London.
Optimal Rectangle Packing: An Absolute Placement Approach
We consider the problem of finding all enclosing rectangles of minimum area that can contain a given set of rectangles without overlap. Our rectangle packer chooses the x-coordinates of all the rectangles before any of the y-coordinates. We then transform the problem into a perfect-packing problem with no empty space by adding additional rectangles. To determine the y-coordinates, we branch on the different rectangles that can be placed in each empty position. Our packer allows us to extend the known solutions for a consecutive-square benchmark from 27 to 32 squares. We also introduce three new benchmarks, avoiding properties that make a benchmark easy, such as rectangles with shared dimensions. Our third benchmark consists of rectangles of increasingly high precision. To pack them efficiently, we limit the rectangles' coordinates and the bounding box dimensions to the set of subset sums of the rectangles' dimensions. Overall, our algorithms represent the current state-of-the-art for this problem, outperforming other algorithms by orders of magnitude, depending on the benchmark.
Optimal Packing of High-Precision Rectangles
Huang, Eric (Palo Alto Research Center) | Korf, Richard E. (University of California, Los Angeles)
The rectangle-packing problem consists of finding an enclosing rectangle of smallest area that can contain a given set of rectangles without overlap. Our new benchmark includes rectangles of successively higher precision, challenging the previous state-of-the-art, which enumerates all locations for placing rectangles, as well as all bounding box widths and heights up to the optimal box. We instead limit the rectangles’ coordinates and bounding box dimensions to the set of subset sums of the rectangles’ dimensions. We also dynamically prune values by learning from infeasible subtrees and constrain the problem by replacing rectangles and empty space with larger rectangles. Compared to the previous state-of-the-art, we test 4,500 times fewer bounding boxes on the high-precision benchmark and solve N =9 over two orders of magnitude faster. We also present all optimal solutions up to N =15, which requires 39 bits of precision to solve. Finally, on the open problem of whether or not one can pack a particular infinite series of rectangles into the unit square, we pack the first 50,000 such rectangles witha greedy heuristic and conjecture that the entire infinite series can fit..
Heuristic Algorithms for Balanced Multi-Way Number Partitioning
Zhang, Jilian (Singapore Management University) | Mouratidis, Kyriakos (Singapore Management University) | Pang, HweeHwa (Singapore Management University)
Balanced multi-way number partitioning (BMNP) seeks to split a collection of numbers into subsets with (roughly) the same cardinality and subset sum. The problem is NP-hard, and there are several exact and approximate algorithms for it. However, existing exact algorithms solve only the simpler, balanced two-way number partitioning variant, whereas the most effective approximate algorithm, BLDM, may produce widely varying subset sums. In this paper, we introduce the LRM algorithm that lowers the expected spread in subset sums to one third that of BLDM for uniformly distributed numbers and odd subset cardinalities. We also propose Meld, a novel strategy for skewed number distributions. A combination of LRM and Meld leads to a heuristic technique that consistently achieves a narrower spread of subset sums than BLDM.
A Hybrid Recursive Multi-Way Number Partitioning Algorithm
Korf, Richard Earl (University of California, Los Angeles)
The number partitioning problem is to divide a given set of N positive integers into K subsets, so that the sum of the numbers in each subset are as nearly equal as possible. While effective algorithms for two-way partitioning exist, multi-way partitioning is much more challenging. We introduce an improved algorithm for optimal multi-way partitioning, by combining several existing algorithms with some new extensions. We test our algorithm for partitioning 31-bit integers from three to ten ways, and demonstrate orders of magnitude speedup over the previous state of the art.